Speaker Diarization of Multi-party Conversations Using Participants Role Information: Political Debates and Professional Meetings
Similar resources
Speaker diarization using eye-gaze information in multi-party conversations
We present a novel speaker diarization method using eye-gaze information in multi-party conversations. In real environments, speaker diarization or speech activity detection of each participant of the conversation is challenging because of distant talking and ambient noise. In contrast, eye-gaze information is robust against acoustic degradation, and it is presumed that eye-gaze behavior plays...
Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations
We propose multi-modal speaker diarization using acoustic and eye-gaze information in poster conversations. Eye-gaze information plays an important role in turn-taking and is thus useful for predicting speech activity. In this paper, a variety of eye-gaze features are elaborated and combined with the acoustic information by the multi-modal integration model. Moreover, we introduce another model ...
Modeling Topic Control in Conversations using Speaker-centric Nonparametric Topic Models
Identifying influential speakers in multi-party conversations has been the focus of research in communication, sociology, and psychology for decades. It has been long acknowledged qualitatively that controlling the topic of a conversation is a sign of influence. To capture who introduces new topics in conversations, we introduce SITS—Speaker Identity for Topic Segmentation—a nonparametric hiera...
Robust Unsupervised Speaker Segmentation for Audio Diarization
Audio diarization (Reynolds & Carrasquillo, 2005) is the process of partitioning an input audio stream into homogeneous regions according to their specific audio sources. These sources can include audio type (speech, music, background noise, etc.), speaker identity and channel characteristics. With the continually increasing number of large volumes of spoken documents including broadcasts, voic...
Multi-modal recording, analysis and indexing of poster sessions
A new project on multi-modal analysis of poster sessions is introduced. We have designed an environment dedicated to recording of poster conversations using multiple sensors, and collected a number of sessions, to which a variety of multi-modal information is annotated, including utterance units for individual speakers, backchannels, nodding, gazing, and pointing. Automatic speaker diarization,...